4 research outputs found

    Tackling Choke Point Induced Performance Bottlenecks in a Near-Threshold GPGPU

    Over the last decade, General Purpose Graphics Processing Units (GPGPUs) have garnered substantial attention in the research community due to their extensive thread-level parallelism. GPGPUs provide a remarkable performance improvement over Central Processing Units (CPUs) for highly parallel applications. However, GPGPUs typically achieve this extensive thread-level parallelism at the cost of high power consumption. Consequently, Near-Threshold Computing (NTC) offers a promising opportunity for designing energy-efficient GPGPUs (NTC-GPUs). However, NTC-GPUs suffer from a crucial Process Variation (PV)-induced performance bottleneck called the choke point. A choke point is a single gate, or a small group of gates, that is disproportionately affected by PV; it can alter the path delay of a circuit and cause different forms of timing violations. In this work, a cross-layer design technique is proposed to tackle the performance impediments caused by choke points in NTC-GPUs.
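
    To make the choke-point effect concrete, below is a minimal Monte Carlo sketch (not from the paper) assuming a Gaussian gate-delay model: one PV-affected gate on an otherwise typical critical path is given a wider delay spread, and the resulting timing-violation rate is compared against an unaffected path. The delay values, path length, and clock budget are all illustrative assumptions.

        import random

        # Illustrative model of a choke point: one PV-affected gate on the
        # critical path carries a wider delay spread than its neighbors.
        # All numbers are hypothetical assumptions, not values from the paper.
        NOMINAL_GATE_DELAY_PS = 50.0   # assumed nominal gate delay at NTC
        TYPICAL_SIGMA_PS = 5.0         # assumed PV spread for a typical gate
        CHOKE_SIGMA_PS = 30.0          # assumed PV spread for the choke point
        PATH_LENGTH = 20               # gates on the critical path
        CLOCK_PERIOD_PS = 1060.0       # assumed timing budget for the path

        def path_delay(has_choke_point):
            """Sum per-gate delays; one gate may carry choke-point PV."""
            delays = [random.gauss(NOMINAL_GATE_DELAY_PS, TYPICAL_SIGMA_PS)
                      for _ in range(PATH_LENGTH - 1)]
            sigma = CHOKE_SIGMA_PS if has_choke_point else TYPICAL_SIGMA_PS
            delays.append(random.gauss(NOMINAL_GATE_DELAY_PS, sigma))
            return sum(delays)

        def violation_rate(has_choke_point, trials=100_000):
            misses = sum(path_delay(has_choke_point) > CLOCK_PERIOD_PS
                         for _ in range(trials))
            return misses / trials

        print(f"violations without choke point: {violation_rate(False):.4%}")
        print(f"violations with choke point:    {violation_rate(True):.4%}")

    Even though the choke point is a single gate among twenty, its variance dominates the path's tail behavior, which is why a localized defect can throttle the whole circuit at NTC.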

    Challenges and Opportunities in Near-Threshold DNN Accelerators around Timing Errors

    AI evolution is accelerating, and Deep Neural Network (DNN) inference accelerators are at the forefront of the ad hoc architectures evolving to support the immense throughput that AI computation requires. However, far more energy-efficient design paradigms are needed to realize the full potential of AI evolution and to curtail its energy consumption. The Near-Threshold Computing (NTC) design paradigm is a strong candidate for providing the required energy efficiency. However, NTC operation is plagued by significant performance and reliability concerns arising from timing errors. In this paper, we dive deep into DNN architecture to uncover unique challenges and opportunities for operation in the NTC paradigm. By performing rigorous simulations on a TPU systolic array, we reveal the severity of timing errors and their impact on inference accuracy at NTC. We analyze various attributes, such as the data-delay relationship, delay disparity within arithmetic units, utilization patterns, hardware homogeneity, and workload characteristics, and uncover unique localized and global techniques for dealing with timing errors at NTC.
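
    As a rough illustration of the kind of simulation the abstract describes, the sketch below injects timing errors into the partial sums of an integer matrix multiply standing in for a systolic array of MACs. The error model (a random high-order bit flip at an assumed error rate), the bit position, and the array sizes are assumptions for illustration, not the paper's methodology.

        import numpy as np

        # Hypothetical timing-error injection for a systolic-array-style
        # matmul: when an error fires, a high-order bit of the running
        # partial sum is flipped. Rate and bit position are assumptions.
        rng = np.random.default_rng(0)

        def matmul_with_timing_errors(a, b, error_rate=1e-3, flip_bit=12):
            """int8 inputs, int32 accumulators, per-MAC error injection."""
            n, k = a.shape
            _, m = b.shape
            out = np.zeros((n, m), dtype=np.int32)
            for i in range(n):
                for j in range(m):
                    acc = np.int32(0)
                    for t in range(k):
                        acc += np.int32(a[i, t]) * np.int32(b[t, j])
                        if rng.random() < error_rate:       # error fires
                            acc ^= np.int32(1 << flip_bit)  # corrupt sum
                    out[i, j] = acc
            return out

        a = rng.integers(-128, 128, size=(16, 16), dtype=np.int8)
        b = rng.integers(-128, 128, size=(16, 16), dtype=np.int8)
        exact = a.astype(np.int32) @ b.astype(np.int32)
        noisy = matmul_with_timing_errors(a, b)
        print("mean absolute output error:", np.abs(exact - noisy).mean())

    A flip in a high-order accumulator bit perturbs the output far more than one in a low-order bit, which is one reason the data-delay relationship inside the MACs matters for inference accuracy.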

    EFFORT: A Comprehensive Technique to Tackle Timing Violations and Improve Energy Efficiency of Near-Threshold Tensor Processing Units

    Modern deep neural network (DNN) applications demand a remarkable processing throughput usually unmet by traditional Von Neumann architectures. Consequently, hardware accelerators, comprising a sea of multiply-and-accumulate (MAC) units, have recently gained prominence in accelerating DNN inference engines. For example, tensor processing units (TPUs) account for a lion's share of Google's datacenter inference operations. The proliferation of real-time DNN predictions is accompanied by a tremendous energy budget. In the quest to trim the energy footprint of DNN accelerators, we propose the Energy eFFicient and errOr Resilient TPU (EFFORT), an energy-optimized yet high-performance TPU architecture operating in the near-threshold computing (NTC) region. EFFORT promotes a better-than-worst-case design by operating the NTC TPU at a substantially higher frequency while keeping the voltage at the NTC nominal value. To tackle the timing errors caused by such aggressive operation, we employ an opportunistic error mitigation strategy. In addition, we implement an in situ clock gating architecture that drastically reduces the MACs' dynamic power consumption. Compared to a cutting-edge error mitigation technique for TPUs, EFFORT enables up to 2.5× better performance at NTC with only a 4% average accuracy drop across six of eight DNN benchmarks.
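
    The abstract does not spell out the opportunistic mitigation mechanism, so the sketch below only illustrates the general better-than-worst-case pattern it alludes to: a Razor-style flag marks a MAC result that missed timing, and instead of a costly pipeline replay the flagged contribution is skipped, trading a small, bounded accuracy loss for sustained throughput. The flagging interface and the skip policy are assumptions, not EFFORT's actual design.

        # Generic better-than-worst-case pattern, for illustration only:
        # on a flagged timing error, keep the previous partial sum rather
        # than stall for a replay. The skip policy is an assumption.
        def mac_step(acc, activation, weight, timing_error):
            """One MAC update; skip the contribution on a flagged error."""
            if timing_error:
                return acc   # opportunistic: accept a small numeric error
            return acc + activation * weight

        def dot(activations, weights, error_flags):
            acc = 0
            for a, w, err in zip(activations, weights, error_flags):
                acc = mac_step(acc, a, w, err)
            return acc

        # One flagged MAC out of four: the result is 58 instead of the
        # exact 70, but no cycles are lost to a replay.
        print(dot([1, 2, 3, 4], [5, 6, 7, 8], [False, True, False, False]))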

    EFFORT: Enhancing Energy Efficiency and Error Resilience of a Near-Threshold Tensor Processing Unit

    Modern deep neural network (DNN) applications demand a remarkable processing throughput usually unmet by traditional Von Neumann architectures. Consequently, hardware accelerators, comprising a sea of multiply-and-accumulate (MAC) units, have recently gained prominence in accelerating DNN inference engines. For example, Tensor Processing Units (TPUs) account for a lion's share of Google's datacenter inference operations. The proliferation of real-time DNN predictions is accompanied by a tremendous energy budget. In the quest to trim the energy footprint of DNN accelerators, we propose EFFORT, an energy-optimized yet high-performance TPU architecture operating in the Near-Threshold Computing (NTC) region. EFFORT promotes a better-than-worst-case design by operating the NTC TPU at a substantially higher frequency while keeping the voltage at the NTC nominal value. To tackle the timing errors caused by such aggressive operation, we employ an opportunistic error mitigation strategy. Additionally, we implement an in-situ clock gating architecture that drastically reduces the MACs' dynamic power consumption. Compared to a cutting-edge error mitigation technique for TPUs, EFFORT enables up to 2.5× better performance at NTC with only a 2% average accuracy drop across 3 of 4 DNN datasets.
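
    As a back-of-the-envelope view of why in-situ clock gating pays off in a DNN accelerator, the sketch below accounts for per-MAC dynamic energy when MACs with a zero operand (common after ReLU) are gated. The gating trigger and the energy figures are assumptions for illustration, not figures from either EFFORT paper.

        # Illustrative power accounting for per-MAC clock gating. The
        # gating trigger (a zero operand) and the energy numbers are
        # assumptions, not measurements from the paper.
        ACTIVE_MAC_ENERGY_PJ = 4.0   # assumed dynamic energy per MAC cycle
        GATED_MAC_ENERGY_PJ = 0.2    # assumed residual when clock is held

        def wave_energy(activations, weights):
            """Energy for one wave of MACs, gating zero-operand MACs."""
            energy = 0.0
            for act, wgt in zip(activations, weights):
                if act == 0 or wgt == 0:
                    energy += GATED_MAC_ENERGY_PJ   # no switching activity
                else:
                    energy += ACTIVE_MAC_ENERGY_PJ
            return energy

        acts = [0, 3, 0, 7, 0, 0, 2, 5]   # ReLU output: many zeros
        wgts = [1, 4, 2, 0, 9, 3, 6, 8]
        gated = wave_energy(acts, wgts)
        ungated = len(acts) * ACTIVE_MAC_ENERGY_PJ
        print(f"gated: {gated:.1f} pJ vs ungated: {ungated:.1f} pJ")

    Because post-ReLU activations are frequently zero, even this simple zero-operand trigger gates a large fraction of MAC cycles, which is consistent with the drastic dynamic-power reduction the abstract claims for gating in general.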
